ABSTRACT

A study examined what trained phoneticians do when 
they are presented with a transcription task to carry out without any 
knowledge of the dialect they are listening to and without any 
explicit phonological theory as a point of departure. The "best" 
tokens of three categories of potential assimilation (full, partial, 
and zero alveolar) were drawn from electropalatograms and recordings 
of sentences and recordings made by a linguistically naive speaker. 
These tokens were then transferred to a test tape and transcribed by 
13 phoneticians, who provided a narrow transcription, lexical
identification, and rating by category. Results suggest that 
identification of these tokens as alveolar or otherwise involves an 
extremely complex set of factors, including knowledge of the auditory 
effects of different tongue gestures, differential vowel quality and 
length in different consonantal contexts, and dialect-specific
allophonic differences. The task of transcription is seen as an 
inherently ambiguous task at several levels, and transcription 
without any kind of theory is not recommended. (MSE)
ON THE LIMITS OF AUDITORY TRANSCRIPTION: A
SOCIOPHONETIC APPROACH*



Paul Kerswill 

Department of Linguistic Science, University of Reading

Susan Wright 
Department of Linguistics, University of Cambridge 

1. Introductory remarks

In a way, this paper only obliquely addresses sociolinguistic issues. But
we can justify its inclusion in this volume in that it fits into a way of 
thinking that has been characteristic of a number of sociolinguists in the 
last eight or ten years. During this period, sociolinguistics has become 
something of a self-scrutinising subject, in that people have questioned 
not only the methodology but also the linguistic and social theory be- 
hind it. This paper can be seen as a contribution to this discussion. 

However, it intends to do so in a novel way. We will tackle an area 
of sociolinguistic methodology which is rarely discussed; and we are
going to try to show that this is of no less theoretical significance: this is
the phonetic nature, and linked with that the transcription, of the actual
sounds uttered by speakers. At first sight, this seems to be a purely
atheoretical problem, a matter of nuts and bolts. After all, transcription
is something that, with a bit of ear-training, we can all get reasonably 
good at; as such, it's simply a tool of the trade. But in our view that is 
YORK PAPERS IN LINGUISTICS 14

not all that transcription is about; it is also part of the theory of 
sociolinguistics. 

There are two reasons why we think this is so. Firstly, from the
speakers' point of view, the sounds are what they use to convey complex
indexical information. Secondly, from the point of view of the linguists,
for them to do their transcription they need a phonological theory,
however rudimentary. Without a theory, they cannot know what kind of
detail to transcribe, and with the wrong theory they will transcribe the
wrong detail. Towards the end of this paper, we will show what trained
phoneticians do when they are presented with a transcription task to
carry out 'cold', without any knowledge of the dialect they are listening
to, and without any explicit phonological theory as a point of departure.

In fact, quite a lot of attention has been paid to the LINGUISTIC
representation of the variants of phonological variables, notably by
Knowles (1978), Lodge (1986), the Milroys (e.g. J. Milroy, 1976) and
Harris (e.g. 1986). Regrettably (for reasons that will become clear), this 
has not gone hand-in-hand with a consideration of what happens during 
the act of transcribing those variants. This will be the central concern of 
this paper. 



2. The importance of phonetic transcription 

Before we look at the experiments we carried out, we will consider in
more detail WHY it is important to examine phonetic transcription. We
will approach this question from two angles: first, from the point of
view of recent dialectology and sociolinguistics in general; and second,
from the specific point of view of a more phonetics-based field which
can be termed 'sociophonetics' - in particular, our own work on
connected speech processes in local Cambridge English.



2.1 Dialectology and sociolinguistics 

First, then, some general points about dialectology and sociolinguistics. 
We can start with a rather alarming discovery made a few years ago by
Peter Trudgill (1983: 31-51). On the basis of the Linguistic Atlas of
England (Orton et al., 1978), he drew the map shown in Fig. 1.



FIGURE 1 
Vowel in 'last' (from Trudgill, 1983)




This shows the reflexes of Middle English a in the word last. The point
to note is this. There is a large band across the middle of England where
the vowel is [a:], which separates two areas with [ɑ:], one to the
south-east, the other to the north-east, in Norfolk. Trudgill was
suspicious of this transcription, believing a front vowel to be more usual
in Norfolk. He noted that this Norfolk [ɑ:] area (marked with an arrow) in
fact covers the locations surveyed by one particular field-worker, who
quite simply 'got it wrong' (1983: 40); the result is a 'field-worker
isogloss' (op. cit.: 38). The moral here is obvious.

Sociolinguists and dialectologists have relied heavily on auditory
phonetic transcription as a basic analytical tool in their investigation of
variation and sound change. And, as we mentioned, it has been treated by 
them as a pre-theoretical notion, and they have regarded it as a tedious 
but necessary evil. In most of the early studies, little attention was paid 
to the transcription itself, though the precise effect of this failure (if this 
is the appropriate word for it) is hard to assess. There are two important 
issues here. These are, first, the reliability and, second, the validity of 
the transcriptions. First, let us look at reliability: how consistent are 
transcriptions both across transcribers and within transcribers? The more 
significant of these, we think, is within-transcriber variability, since 
most of the transcription is usually done by a single person. The main 
question here is whether or not a transcriber is consistent: will he or she 
transcribe the same token the same way twice? And does that transcriber
have a tendency to 'drift' in his or her judgments over a period of time? 
We shall not in this paper have any more to say on the subject of
reliability. We shall be more concerned with the validity of the
transcriptions. Here, the main question is the way in which a transcription
reflects (a) articulatory facts and (b) auditory impressions. So we might
like to consider whether there is a consistent bias towards a particular 
transcription in, say, a particular phonological environment, or whether 
manner of articulation influences the perception of place of articulation. 

We mentioned earlier the increasing sophistication of the linguistic 
variable. Just to give some idea of how complex a variable can become,
consider Table 1, which shows one linguist's analysis of the variants of
two vowels in Liverpool: 
TABLE 1

/ʊə/ and /ɔə/ in Liverpool (from Knowles, 1978: 85)

Lax [u, o] before an unstressed vowel:
or (a) diphthongize [u, o]:
or (b) front [u]:
modify VVV to V + glide + V:
front final [ə]:

sure /ʊə/    shore /ɔə/



iu£ lu6 oue 



ue, ?lte 
iue, lue 
uc, vwe 



owe 
oe, Toe 
oue 
owe 



He sees these variants as generated by a set of interacting rules, which 
represent 'the options open to the speaker at different stages in speech 
production, and the way these options can be used to convey sociolin- 
guistic information about the speaker' (1978: 90). Similarly, for the 
consonantal variable (ng), corresponding to the velar nasal in RP sing, 
Knowles identifies the following variants, again generated by rules (op.
cit.: 86):

sigg sir) 
supg surj 
slip si:r)9 

Knowles' analysis is multidimensional. This is true also of the
Milroys' analysis of Belfast vowels. Table 2 (taken from Milroy, 1987:
124) shows the variables (o) and (e), as realised in the data for a single
speaker. Milroy argues that these variants should be analysed in terms of 
three sub-scales: roundness, backness and length. 
Clearly, if sociolinguists are going to operate with this amount of
detail, we need to know something about the reliability and validity of
the transcriptions on which their (usually sophisticated) analyses are
based.

TABLE 2 

(o) and (e) in Belfast (from Milroy, 1987: 124) 



(o): g a a: o x 

got shop 
Poly tech 

shop probably job 

pot concentrated of 

vodka God 
bottom 

(e): e e: e e: 

set-up specials rod 

lent went tell 

went ten 
specials 



remember 
twenty 



2.2 Sociophonetics 

If Liverpool causes difficulties for the transcriber, this is even more true 
of a relatively new field of study, which intersects, to a greater or lesser 
extent, with correlational sociolinguistics. This is the growing field of 
sociophonetic research. A recent, though largely descriptive example is 
Lodge's (1986) outline of the phonetics and phonologies of a number of
non-standard varieties of English. In it, he pays special attention to the
word in connected speech. As in other sociolinguistic studies, Lodge
uses an auditory transcription, noting quite fine detail. 
Lodge is not only interested in 'traditional' phonetic variables, but
also in the range of assimilations, deletions and epentheses of normal
connected speech. In our work in Cambridge, we too have focused on 
these phenomena, which we call connected speech processes, or CSPs. 

We will digress a little at this point to say something about the 
background to our Cambridge project, so as to make it plain just why 
we have conducted the transcription experiments we are going to be
reporting. Unlike the 'traditional' variables of sociolinguistics, CSPs are
in some sense phonetically motivated: that is, their application can be 
explained with reference to the physiology and the dynamics of the vocal 
tract. Our own interest in these phenomena derives from two sources. 
The first is the observation that conditioned sound changes are always 
the result of the fossilisation of CSPs. The second concerns the fact that
CSPs tend, despite their 'naturalness', to be to some extent
variety-specific. This is shown in Dressler's work in Vienna (Dressler & Wodak,
1982) and in the work of one of us in Durham (Kerswill, 1987). Dressler
talks about lenition and fortition rules (i.e. CSPs) which serve to ease
production (in the case of lenitions) or to ease perception (in the case of
fortitions). Some of these processes are apparently specific to one or
other of the two major varieties of German spoken in Vienna: the local
dialect and standard Austrian German. In Durham, Kerswill observed that
certain processes usually described for English appeared to be absent,
while others not generally found in the literature were present. The two
clearest examples are those shown in Table 3, overleaf. By combining
the facts of sound change and the variety-specific nature of CSPs with
the sociolinguistic axiom that sounds undergoing change are
sociolinguistically salient, we arrived at the basic hypothesis of our study. This
is, to put it quite simply, that some connected speech processes will 
show social differentiation in a speech community. 
TABLE 3

Connected speech processes in Durham (from Kerswill, 1987: 44)

(1) CSP present in Durham, absent in RP:

Regressive voicing assimilation:

like [g] bairns;
like [g] me;
each [dʒ] deputy;
this [z] village;
scraped [d] down;
what's [dz] gone in, man?
good chap [b], Jack

(2) CSP present in RP, absent in Durham:

Assimilation of place of articulation:

that pen [ðæʔt pen] -> [ðæʔppen]
that cup [ðæʔt kʌʔp] -> [ðæʔkkʌʔp]
good pen [gʊd pen] -> [gʊbpen]
good car [gʊd kɑ:] -> [gʊgkɑ:]

In the Cambridge project, we are looking at a range of processes,
particularly place assimilation, l-vocalisation, syllable deletion and
palatalisation. We are doing this by combining the techniques of
sociolinguistics with those of experimental phonetics. We are looking at natural
speech from a sample of speakers differentiated by social class, sex and
age. At the same time as looking at social differentiation, we are also
looking at the effects of speech style, particularly speaking rate, as well
as the more usual style parameter of formality. (Various aspects of the
project, along with some of our results, are reported in Kerswill, 1985b;
Wright, 1986; Nolan and Kerswill, 1988; Wright and Kerswill, 1989.)
An important part of our hypothesis is that some CSPs will behave
in a way comparable with 'ordinary' sociolinguistic variables. We also
hypothesise that some of these sociolinguistically salient CSPs will
tend towards articulatory discreteness: that is, they will apply in an
all-or-nothing way. They will, in other words, be beginning to show the
characteristics of fossilisation and subsequent phonologisation. On the
other hand, non-salient CSPs will be more purely phonetically, or
naturally, motivated, and will be directly sensitive to speaking rate changes.
As such, we can expect them to be phonetically gradual in their
application. We can, then, expect to find varying degrees of partial deletions,
partial assimilations, residual articulatory gestures, etc. This notion of
articulatory gradualness would seem to be especially relevant to one
particular favourite sociolinguistic variable: that of final t or d deletion; yet
gradualness does not appear to have been considered in the context of
these variables.

We need, then, to be able to identify this articulatory gradualness.
To do this, we carried out an electropalatographic study of assimilations.
Electropalatography (EPG) is a technique which allows the dynamic
contact of the tongue against the roof of the mouth to be recorded. The
subject wears a specially-made acrylic palate in which are embedded 62
electrodes. A computer records the contact of the tongue with these
electrodes. Fig. 2 shows some typical EPG output. Each 'palatogram' shows
the degree of lingual contact with the palate during a particular 10 ms
window; the top row of dots represents electrodes situated along the
alveolar ridge, the bottom row those at the junction between the hard and
soft palates. Fig. 2 (overleaf) shows the tongue contacts at the word
boundaries in utterances where there is a (potential) assimilation of a
final d to an initial k (Ib, IIb, IIIb), together with 'control' utterances with
underlying final g (Ia, IIa, IIIa). Details of the analysis will be given
below; but suffice it to say that there is clear evidence here of
articulatory gradualness, shown by the progression from a complete lack of
assimilation (Ib), through a partial assimilation (IIb), to a complete
assimilation (IIIb).
FIGURE 2

Palatograms showing degrees of assimilation

(Ia) Craig couldn't
(Ib) laid couldn't (No assimilation)

(IIa) Craig couldn't
(IIb) laid couldn't (Partial assimilation with residual alveolar articulation)

(IIIa) Hag castle
(IIIb) bad car (Total assimilation, no trace of alveolar gesture on palatogram)
From the point of view of transcription, the relationship between
articulation and the percept is an extremely important one. This is not
only true with gradual processes such as assimilation, but also in
transcription generally. To illustrate this, we can take the potential minimal
pair shown below. Do these ever merge, as suggested by the
transcription given, or will there always be some articulatory or auditory
difference?

fang collector, fan collector -> [fæŋ kəlektə]

The question is: does perception operate phoneme-categorially, and
classify intermediate forms decisively as (in this example) fan or fang;
and if so, can we talk in terms of a perceptual boundary lying on a
putative continuum of alveolar loss? How would this affect a phonetician's
attempt at transcribing a potential assimilation? The relationship
between articulation and perception is something that our experiment has
tried to elucidate.

Finally, before we consider the experiment, we shall raise an issue 
that is well known, but still not sufficiently discussed: this is the
likelihood that a segmental transcription predisposes the phonetician to
transcribe a series of discrete articulations, whereas we know that
articulations blend and overlap in a complex way. It is true that a transcription
can record double articulations, partially overlapping articulations, and 
even the spread of a feature, such as nasalisation, over more than one 
segment. Despite this, the segments do get transcribed in sequence. 
Moreover, and this is important from our point of view, the segments 
tend to get transcribed in an all-or-nothing way. All this predisposes the 
transcriber to hear a series of discrete, completely articulated segments.

3. The experiment 

This experiment explores the relationship between auditory phonetic 
transcription and some aspects of articulatory fact by comparing tran- 
scriptions of potential assimilations with EPG records of the same to- 
kens. 
On the basis of earlier experiments, we decided to use as categories
three 'degrees' of assimilation. These are associated with three different
scores, and correspond to the categories shown in Fig. 2, above. The
categories, or EPG conditions, can be more explicitly defined as follows:

1. Full alveolar: the EPG record shows a complete alveolar closure
at some point during the articulation.

2. Partial alveolar (residual alveolar gesture): the record shows more
lateral and/or alveolar contact than the non-assimilating
environment, but nonetheless shows no complete closure at any point
during the articulation.

3. Zero alveolar (complete assimilation): the record is either identical
with the non-assimilating environment, or else shows less lateral
and/or alveolar contact than it.

For reasons which will become clear below, we added a fourth EPG con- 
dition: 

4. Non-alveolar (underlying velar or bilabial). 
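
The first three EPG conditions can be read as a simple decision procedure over the palatogram frames of a token. The Python sketch below is only an illustration under stated assumptions, not the procedure used in the study: each frame is assumed to be a grid of 0/1 contacts with row 0 at the alveolar ridge, the first FRONT_ROWS rows are assumed to stand for the alveolar and front lateral region, and 'more contact than the non-assimilating environment' is approximated by comparing peak front contact with that of the control token. The names classify, front_contact, has_alveolar_closure and FRONT_ROWS are hypothetical. Condition 4 is not decided from the EPG record at all, since it depends on the underlying final consonant.

```python
from typing import List

# One palatogram frame: rows of 0/1 contacts, row 0 = alveolar ridge (front).
Frame = List[List[int]]

# Assumption: how many front rows count as the alveolar / front lateral region.
FRONT_ROWS = 2


def has_alveolar_closure(frame: Frame) -> bool:
    """True if some front row is contacted across its whole width."""
    return any(all(row) for row in frame[:FRONT_ROWS])


def front_contact(frame: Frame) -> int:
    """Number of contacted electrodes in the front rows of one frame."""
    return sum(sum(row) for row in frame[:FRONT_ROWS])


def classify(token: List[Frame], control: List[Frame]) -> str:
    """Assign an underlyingly alveolar token to EPG condition 1, 2 or 3.

    `control` holds the frames of the matching non-assimilating (velar or
    bilabial) utterance, used as the baseline for residual contact.
    """
    if any(has_alveolar_closure(f) for f in token):
        return "full alveolar"
    token_peak = max(front_contact(f) for f in token)
    control_peak = max(front_contact(f) for f in control)
    if token_peak > control_peak:
        return "partial alveolar"
    return "zero alveolar"
```

Comparing peak contact is one of several possible baselines; a frame-by-frame comparison aligned at the closure would be closer to the wording of the definitions, at the cost of needing an alignment step.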

We then made a list of sentences containing possible word-final
assimilations of /d/ to a following velar or bilabial, together with 'control'
sentences with underlying velars and bilabials. (We did not include /t/:
final /t/ is normally realised as a pure glottal stop in many varieties of
English, particularly preconsonantally, as in these examples. Our
parallel study of final /n/ will be reported elsewhere: Wright and Kerswill, in
prep.) The assimilation 'sites' and their controls are given below:



        Assimilation site     Control

d+k     road collapsed        rogue collapsed
d+k     Byrd concert          Berg concert
d+k     fad catch             fag catch
d+g     did gardens           dig gardens
d+g     bed girls             beg girls
d+g     lead got              leg got
d+m     bride must            bribe must

We got a phonetician to make EPG and audio recordings of these
sentences. The ones with underlying alveolars were recorded with each of
the three 'degrees' of assimilation: with full alveolar articulation, with
partial alveolar closure, and with no alveolar closure. The control
utterances were also recorded. This gave us tokens of our four 'EPG
conditions' - three underlyingly alveolar, one underlyingly velar or bilabial. In
all the tokens, any hint of an audible release was avoided. In order to
partially guard against any unrepresentativeness in the production by the
phonetician, we compared his EPG records with those produced by a
linguistically naive speaker in an earlier experiment. On the basis of this
comparison, we picked out the 'best' tokens of each category for use in
the listening test.

The tokens were transferred to a test tape in such a way that the 
'control' member of each sentence pair occurred four times and each of 
the three degrees of assimilation for the underlying alveolars occurred 
twice each. This gave us a tape on which one-third of the tokens were 
control items. They were ordered such that identical sentences and 
'articulation types' were not adjacent. Thirteen other phoneticians then 
acted as subjects. Their task was to provide the following: 

- a narrow transcription (of preceding vowel and consonant assimila- 
tion site) 

- lexical identification (judgment of underlying final /g, b/ vs.
underlying /d/)

- rating of words judged to end in an alveolar as having: 

full alveolar contact, 
partial alveolar contact, or 
zero alveolar contact 

In this way, we hoped to be able to see what criteria, if any, the
transcribers used in deciding between the various degrees of assimilation and
the non-alveolar control environments.
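
The ordering constraint on the test tape (no adjacent repetition of a sentence or of an 'articulation type') can be sketched as a small randomised search. The Python below is a hypothetical illustration, not a record of how the tape was actually assembled: it assumes the seven sentence pairs listed above, four control tokens and two tokens of each degree of assimilation per pair, and the stricter reading of the constraint under which adjacent tokens may share neither the sentence pair nor the articulation type. The function names build_tokens, try_order and make_tape are invented for the example.

```python
import random
from collections import Counter

PAIRS = ["road/rogue", "Byrd/Berg", "fad/fag", "did/dig",
         "bed/beg", "lead/leg", "bride/bribe"]
# Repetitions per sentence pair, as described in the text.
REPS = {"control": 4, "full": 2, "partial": 2, "zero": 2}


def build_tokens():
    """All (pair, articulation type) tokens that go on the tape."""
    return [(pair, kind) for pair in PAIRS
            for kind, n in REPS.items() for _ in range(n)]


def try_order(tokens, rng):
    """One greedy attempt; returns None if it runs into a dead end."""
    remaining = list(tokens)
    order = []
    while remaining:
        prev = order[-1] if order else (None, None)
        # Candidates must differ from the previous token in pair AND type.
        ok = [t for t in remaining if t[0] != prev[0] and t[1] != prev[1]]
        if not ok:
            return None
        kinds = Counter(kind for _, kind in remaining)
        pairs = Counter(pair for pair, _ in remaining)
        # Prefer the most plentiful type, then the most plentiful pair,
        # so that neither piles up at the end; break ties at random.
        best = max((kinds[k], pairs[p]) for p, k in ok)
        choice = rng.choice(
            [t for t in ok if (kinds[t[1]], pairs[t[0]]) == best])
        order.append(choice)
        remaining.remove(choice)
    return order


def make_tape(seed=0, max_attempts=1000):
    """Retry the greedy attempt until a valid ordering is found."""
    rng = random.Random(seed)
    tokens = build_tokens()
    for _ in range(max_attempts):
        order = try_order(tokens, rng)
        if order is not None:
            return order
    raise RuntimeError("no valid ordering found")
```

Calling make_tape(seed=1) returns a list of (pair, type) tuples in presentation order; a different seed gives a different tape that satisfies the same adjacency constraint.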



3.2 Articulatory gradualness reflected in identifications

The results of the identification part of the task were as shown in Fig. 3,
which gives the phoneticians' judgments of the tokens as underlyingly
alveolar. Tokens which were articulated with a complete alveolar closure
('EPG condition 1') were almost consistently identified as alveolar. The
percentage of alveolar identifications rapidly drops across the other three
EPG conditions - partial alveolar, completely assimilated ('zero') alveolar
and underlyingly velar/bilabial.

As we would expect, a good deal of 'alveolarity' seems also to be 
cued by the auditory impression made by the partially assimilated tokens 
(condition 2). However, perhaps the most interesting results concern
conditions 3 and 4, both of which show substantial alveolar scores.
Before attempting to interpret the scores for these two conditions, we
should first ask why any of the condition 4 tokens should be rated as
alveolar at all. Three factors should be noted: (1) any 'error' will raise,
not lower the score; (2) we can expect listeners to try to 'hear' alveolarity
even when there is none; and (3) due to redundancy, some phonetic
indeterminacy is tolerated in natural speech; here, in the absence of
redundancy, the phonetic indeterminacy becomes critical.

Condition 3 tokens, which, according to the EPG record, are completely
assimilated, are identified as alveolar more frequently than condition 4
tokens. This evidence suggests that there is, in many if not all of
the condition 3 tokens, some kind of articulatory 'residue' which is
having acoustic consequences without leaving a trace in the EPG record.
The question then arises: what is the nature of this acoustic cue, and
how do phoneticians set about exploiting it in a phonetic transcription?
FIGURE 3

Percentage alveolar identifications for four EPG conditions

[Bar chart. Vertical axis: /d/ judgments (%), 0 to 100. Horizontal axis: EPG condition: 1 (full /d/), 2 (part /d/), 3, 4 (/g/, /b/ or /m/).]


We will look first to see if EPG can give us any indications as to
what these cues are. Remember that a residual alveolar contact shows up
as lateral contact and partial alveolar contact. However, looking at some
of the tokens that we originally classed as 'completely assimilated', we
note something peculiar. This is shown in Fig. 4 (overleaf). Note how, in
these pairs, it is the assimilated alveolar that has the lesser lateral con- 
tact and the more retracted velar articulation. This is intuitively unex- 
pected. But look at the identification scores for these three items (Table 
4, overleaf). 

For two of these three pairs (lead/leg and bed/beg), the difference
between the scores for the two types is very much greater than for all the
pairs taken together; as Table 4 shows, this is not true of any other
single pair. There is, therefore, something differentiating these pairs rather
FIGURE 4

Palatograms and spectrograms showing alveolars and velars

[Panels for three pairs: dig (/g/) and did (zero alveolar /d/); leg and lead (zero alveolar); beg (/g/) and bed (zero alveolar).]
clearly. This is evidently not alveolar contact. The EPG patterns of Fig.
4 can in fact be taken as evidence of a residual tongue body configuration
appropriate for an alveolar: as the tongue tip moves up towards the
alveolar ridge, the blade and pre-dorsum become concave; this reduces the
amount of lateral contact in the pre-velar area. At the same time, this
tongue shape will cause the velar contact itself to be more retracted.
Some support for this interpretation is provided by spectrographic data

TABLE 4

Identification scores for individual tokens - EPG conditions 3 and 4

EPG CONDITION 3

                Identifications as:
                Alveolar    Non-alveolar    % alveolar

did/dig            16           12             57
lead/leg           15           13             54
bed/beg            16           12             57
road/rogue         16           12             57
Byrd/Berg          12           16             43
fad/fag             8           20             29
bride/bribe        15           13             54

EPG CONDITION 4

                Identifications as:                        Difference between
                Alveolar    Non-alveolar    % alveolar     conditions 3 & 4

did/dig            25           31             45               12
lead/leg            8           48             14               40
bed/beg            12           44             21               36
road/rogue         28           28             50                7
Byrd/Berg          27           29             48               -5
fad/fag             8           48             14               15
bride/bribe        24           32             43               11
for at least one of these pairs: in Fig. 4, the locus of F2 and F3 is
higher for dig than for did, which suggests velar and alveolar offsets, re- 
spectively. This lingual configuration may in fact be heard as 
'alveolarity'. This is the reason why the more retracted articulation is 
heard as more alveolar: it is the overall configuration of the tongue that 
has the acoustic consequences. 
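
The percentage and difference columns of Table 4 follow directly from the identification counts, and the arithmetic can be checked with a short Python sketch. The counts below are those given in Table 4; the function names and the pooled figure across all seven pairs are additions for illustration and are not reported in the table itself.

```python
# (alveolar, non-alveolar) identification counts from Table 4.
CONDITION_3 = {
    "did/dig": (16, 12), "lead/leg": (15, 13), "bed/beg": (16, 12),
    "road/rogue": (16, 12), "Byrd/Berg": (12, 16), "fad/fag": (8, 20),
    "bride/bribe": (15, 13),
}
CONDITION_4 = {
    "did/dig": (25, 31), "lead/leg": (8, 48), "bed/beg": (12, 44),
    "road/rogue": (28, 28), "Byrd/Berg": (27, 29), "fad/fag": (8, 48),
    "bride/bribe": (24, 32),
}


def percent_alveolar(alveolar, non_alveolar):
    """Alveolar identifications as a whole-number percentage."""
    return round(100 * alveolar / (alveolar + non_alveolar))


def differences():
    """Condition 3 minus condition 4 percentage for each pair."""
    return {
        pair: percent_alveolar(*CONDITION_3[pair])
        - percent_alveolar(*CONDITION_4[pair])
        for pair in CONDITION_3
    }


def pooled_percent(table):
    """Percentage alveolar over all pairs taken together (derived here)."""
    alveolar = sum(a for a, _ in table.values())
    total = sum(a + n for a, n in table.values())
    return round(100 * alveolar / total)


if __name__ == "__main__":
    for pair, diff in differences().items():
        print(f"{pair:12s} {diff:4d}")
    print("pooled:", pooled_percent(CONDITION_3) - pooled_percent(CONDITION_4))
```

On these counts the differences for lead/leg and bed/beg come out at 40 and 36 percentage points, against a pooled difference of 16, which is the contrast the discussion of these two pairs relies on.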



3.3 Transcription strategies - a mixed bag 

EPG gives us, then, a clue as to the articulatory correlates of
assimilation. In another paper (Wright & Kerswill, 1989), we have argued that
this data suggests that there may be no such thing as 'complete'
assimilation: there is always some articulatory 'residue' in 'maximally'
assimilated items. However, here we shall look in some detail at how
transcribers set about rationalising and reducing to symbols the differences
they have heard. Table 5 shows the transcriptions of condition 4 and
condition 3 tokens of the three pairs just mentioned. (We have included
only those transcriptions where (a) condition 4 tokens were correctly
identified as velars, and (b) condition 3 tokens were correctly identified as
alveolars and judged as having either partial or zero (but not full)
alveolar articulation.) A striking overall pattern is the high frequency with
which condition 3 is 'heard' as a partial alveolar rather than as the
'correct' zero alveolar. This should not surprise us, since once
transcribers have decided they are listening to an alveolar, they will
presumably try to indicate some sort of alveolarity in the transcription.

It is more interesting, however, to try to establish the strategies 
transcribers use to differentiate the velar and the alveolar tokens, and then
to try to match these with the acoustic and articulatory data. An
inspection of Table 5 shows there is much individual variation. However, three
strategies seem to recur: these involve marking differences in vowel or
consonant length, differences in vowel quality, and consonantal
differences. We will discuss these in turn.
TABLE 5

Transcriptions of tokens of three pairs of items

NOTE: only those transcriptions have been included where (a) condition
4 tokens were correctly identified as velars, and (b) condition 3 tokens
were correctly identified as alveolars and judged as having either zero ('3')
or partial ('2') (but not full) alveolar articulation.



Item: 



Transcriber 



1 did/dig A 
EFG Judgm 



4/g/ 
3/d/ 
3/d/ 



4/g/ 
3/d/ 
3/d/ 



4 

3 
2 



4 
3 
2 



2 lead/leg 

4/g/ 4 



3/d/ 
3/d/ 



4/g/ 
3/d/ 
3/cV 



3 
2 



4 
3 
2 



vg n 



vd J(2) 
H 




ig 
ig 



< ^g (2)/ 

: s 

' v r(d)g'i(d)g 

A B 

eg n (3) ^ (1) 



l g (4) 6;9+ i;g (3)\ 
l 9 Ug i;g I 
wic, \ / 
4/g/ 4
3/d/ 3 
3/d/ 2 



3 bed/bej 
4/g/ 

3/d/ 

3/d/ 




D F_ 

?g (4) /^g'^g 



4/g/ 
3/d/ 
3/d/ 



4 
3 
P 




vowel quality difference 



length difference



Vowel and consonant length 

In five cases (enclosed in the table by a broken line), transcribers mark 
length differences. In three of these cases, the alveolar is heard as being 
preceded by a longer vowel, while in the other two the velar is given a
longer consonant closure. Surprisingly, there is no evidence at all of
longer vowels in bed and did than in their velar counterparts (see Fig. 4);
yet for lead, whose vowel is measurably longer, no transcribers indicate
this length. Consonant length differences (enclosed by a continuous line)
can perhaps be seen as the other side of the same coin: a consonant after
a durationally short vowel may be auditorily longer than after a longer
vowel. If this is so, it is no less 'correct' to indicate a long consonant
than to indicate a short vowel.

All five cases of length difference seem, then, to point in the same 
direction. However, there is disagreement between the transcribers as to 
where this length resides. And where there is a clear vowel duration
difference, this is apparently not heard as such; conversely, when there is
no measurable difference in vowel duration, some transcribers seem to
want to mark one. Whether or not there are consonant duration
differences will have to await further spectrographic analysis. But for the
moment, how can we explain the evident mismatch between measurable
vowel durations and the transcriptions? As linguists, phoneticians
'know' about allophonic vowel duration differences, and it may be that
they are trying to 'hear' such a difference - even though none is predicted
phonologically (both /g/ and /d/ are voiced). Indicating length may be a
more or less conscious attempt to rationalise a difference they can hear,
using the limited resources of the IPA - one of which is to mark length.
Alternatively, the percept of a length difference may be the
psychoacoustic correlate of a consistent phonetic difference. As such, the
percept is 'real' in the sense that it is not the consequence of an attempt to
mark a difference willy-nilly, as in the case of the first explanation. Both
explanations may have an element of truth in them: the fact that the
differences marked by the transcribers are consistent with each other
suggests a 'real' perceptual difference, while the disagreement as to where the
difference lies suggests ad hoc attempts to indicate it using the
transcription resources available.

Vowel quality differences 

In twelve cases, we find vowel quality differences. In nine, the vowel
before the velar is heard as closer than that before the alveolar; in only one
case is the opposite true. Inspection of the spectrograms in Fig. 4 does
not reveal any decisive differences; however, 'reading' vowel quality from
the rapidly changing patterns on a spectrogram is notoriously difficult.
There is obviously considerable agreement among the transcribers; even
so, we must question the validity of their transcriptions because of the
influence of their assumed prior 'knowledge' that closer allophones of
vowels occur before velars. To test this source of error, we would need
to carry out perception tasks using synthetic stimuli, or using edited
natural stimuli from which the vowel offsets have been removed.
However, the strength of the agreement certainly suggests the
preservation of allophonic height differences even after the final consonant
has been apparently assimilated.

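
The strength of this agreement can be given a rough numerical form. The
following is a minimal sketch for illustration only and is not part of the
original analysis. It assumes that the ten cases with a stated direction
(nine against one) can be treated as independent trials, which is doubtful
since the same transcribers and tokens recur; the two remaining cases, whose
direction is not stated, are left out. The function name sign_test_p is
hypothetical.

```python
from math import comb


def sign_test_p(k: int, n: int) -> float:
    """One-sided probability of at least k 'successes' in n trials
    when each direction is equally likely (p = 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


# Counts taken from the text: nine cases closer before the velar,
# one case closer before the alveolar.
closer_before_velar = 9
closer_before_alveolar = 1
directional_cases = closer_before_velar + closer_before_alveolar

p_one_sided = sign_test_p(closer_before_velar, directional_cases)
print(f"{p_one_sided:.4f}")  # 11/1024, i.e. 0.0107
```

A small value here says only that the direction of the marked difference is
unlikely to be arbitrary; it cannot separate a genuine percept from the
shared prior 'knowledge' discussed above.
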
Consonant differences 

In most of the cases, the transcribers note consonantal differences. This
is particularly true, of course, where the transcriber has judged the
alveolar as having a 'partial' articulation. There is a multiplicity of
transcription strategies, suggesting that it is in the transcription of the
consonant that the IPA itself fares worst. Strategies include using:

'no release' diacritic 
'voicelessness' diacritic
'double articulation' diacritic 
'retraction' diacritic 
'fronting' diacritic 
'length' diacritic 
'shortness' diacritic 
'lowering' diacritic 
parentheses 
superscripts 

Some of these can be interpreted as representing the same intention on
the part of the transcriber, though some can be taken simply to mean
uncertainty (especially parentheses and superscripts). It is quite clear,
however, that, unlike in the case of the vowels, the transcribers are
explicitly aiming to represent articulation rather than, say, an abstract
auditory parameter that might be labelled 'alveolarity'. The success of
their enterprise will depend at the very least on (a) their ability to
discriminate without being influenced by their phonological knowledge;
(b) their experience with transcription; and (c) their knowledge of
articulatory phonetics. To this must, of course, be added their degree of
commitment to the task.

In representing what they hear for the consonants, the transcribers 
are constrained by the segmental nature of the IPA, and the relative diffi- 
culty of indicating phonetic features which change gradually over time 
and which are spread over more than one 'segment'. 

4. Discussion

We think the identification of these tokens as alveolar or otherwise
involves an extremely complex set of factors. Firstly, the listener must
have knowledge of the auditory effects of different tongue gestures,
including the 'residual' ones we are hypothesising. Secondly, as is well
known, vowel quality and vowel length vary in different consonantal
contexts; it is likely that these differences remain even after so-called
assimilation has taken place, and continue to cue alveolarity. Lastly, an
important part of these allophonic differences in vowels is that, in spite
of certain universal tendencies, they are to a great extent dialect-specific,
and the listener needs to have knowledge of the dialect (in this case, the
speaker had a mild South Yorkshire accent), and even knowledge of the
speaker himself, to be able to unravel all these effects in such a way as
to utilise them.

Transcription is a messy thing. For some people in this study, it is
a way of representing a sequence of segments which are either
articulatorily complete or non-existent - as some of the transcriptions show.
Others seem more willing to allow incomplete or overlapping segments,
but are still bound by the notion of articulatory segments. Yet others
transcribe vowel quality differences. But we still don't know whether the
vowel differences are due to residual articulatory gestures, or whether
they are phonologically determined, accent-specific allophonic differences
that remain even where there is no residual articulatory gesture. In some
cases, the transcribers could even, consciously or unconsciously, be
tuning in to formant transitions which are not normally considered part of
vowel quality and which are certainly not considered part of a
phonological analysis.

To sum up, the problem lies in an inherent multi-layered ambiguity
in the task of transcription itself. First, transcription is either meant to
represent articulations, or it is meant to represent auditory impressions.
Second, it either represents discrete segments, in which case it
presupposes a prior phonological analysis, or it represents a continuously
varying acoustic signal. Lastly, the continuous nature of the acoustic signal
is either the result of pure, universal coarticulation or it is the result of
accent-specific allophonic and sandhi rules. The snag is, all these things
are true to different degrees, and unfortunately transcribers will put the
boundary between each of the pairs of opposites in different places. This
is what we meant when we said at the outset that transcribing without
any kind of theory is a dangerous thing: we simply do not know exactly
what each individual is doing, and consequently we cannot interpret
precisely what they write down.



REFERENCES 

Dressler, W.U. and Wodak, R. 1982. Sociophonological methods in the study
of sociolinguistic variation in Viennese German. Language in Society
11.339-370.

Harris, J. 1986. Phonetic constraints on sociolinguistic variation. Sheffield
Working Papers in Language and Linguistics 3.120-43.

Kerswill, P.E. 1984. Social and linguistic aspects of Durham e. Journal of
the International Phonetic Association 14.113-34.

Kerswill, P.E. 1985a. A sociolinguistic study of rural immigrants in Bergen,
Norway. Unpublished Ph.D. dissertation, University of Cambridge.

Kerswill, P.E. 1985b. A sociophonetic study of connected speech processes
in Cambridge English: an outline and some results. Cambridge Papers
in Phonetics and Experimental Linguistics 4.

Kerswill, P.E. 1987. Levels of linguistic variation in Durham. Journal of
Linguistics 23.25-49.

Knowles, G.O. 1978. The nature of phonological variables in Scouse. In
Trudgill, P. ed. 1978. Sociolinguistic Patterns in British English.
London: Arnold.

Lodge, K. 1986. Studies in the Phonology of Colloquial Englishes. London:
Croom Helm.

Milroy, J. 1976. Length and height variations in the vowels of Belfast
vernacular. Belfast Working Papers in Language and Linguistics 13.

Milroy, L. 1980. Language and Social Networks. Oxford: Blackwell.

Milroy, L. 1987. Observing and Analyzing Natural Language. Oxford:
Blackwell.

Nolan, F. and Kerswill, P.E. 1989. The description of connected speech
processes. In S. Ramsaran, ed. Essays in Honour of A.C. Gimson.
London: Routledge.

Orton, H., Halliday, W.J., Sanderson, S., Tilling, P.M., Wakelin, M.F. and
Wright, N. 1962-7. Survey of English Dialects Vols. I-IV. Leeds: E.J.
Arnold.

Orton, H., Sanderson, S. and Widdowson, J. 1978. Linguistic Atlas of
England. London: Croom Helm.

Romaine, S. ed. 1982. Sociolinguistic Variation in Speech Communities.
London: Arnold.

Trudgill, P. 1983. On Dialect. Oxford: Blackwell.

Wright, S. 1986. The interaction of sociolinguistic and phonetically-
conditioned CSPs in Cambridge English: auditory and
electropalatographic evidence. Cambridge Papers in Phonetics and
Experimental Linguistics 5.

Wright, S. and Kerswill, P.E. 1988a. On the perception of connected speech
processes. Paper given at the Linguistics Association of Great Britain
meeting, University of Durham, March 1988.

Wright, S. and Kerswill, P.E. 1988b. EPG in the analysis of connected
speech processes. Paper given at the First National Symposium on the
Clinical Applications of Electropalatography, University of Reading,
March 1988.

Wright, S. and Kerswill, P.E. 1988c. On the perception of connected speech
processes. Paper given at the Colloquium of the British Association of
Academic Phoneticians, Trinity College, Dublin, March 1988.

Wright, S. and Kerswill, P.E. 1989. Electropalatography in the analysis of
connected speech processes. Clinical Linguistics and Phonetics 3.

Wright, S. and Kerswill, P.E. in preparation. On the perception of connected
speech processes.





